BREGX is a demonstration program using AWK to process *binary* files.
The source code is specific to the Thompson AWK compiler and Instant AWK
versions, but may contain ideas useful in other implementations of the
language.  Since all files are fundamentally binary, the program can be
used on ANY type of file.

Limits:

The largest file that can be processed is the smallest of: the amount of
disk space available, 2^32 bytes, or the largest file the operating
system can support.  The largest replacement text is the smaller of the
maximum length you can get on the command line or 128 characters.  The
pattern has the same limits as the replacement string.  The smallest
file that can be processed is 1 byte; the program will not attempt to
read empty files.

SYNTAX EXAMPLES:

     BREGX {pattern} {string} {source} {target} [offset] [count]
     BREGX {pattern} {source} /f [offset] [count]

where
     {pattern} is the regular expression to search for;
     {string}  is the string literal to substitute for the pattern;
     {source}  is the source file;
     {target}  is the name of a file to write the processed data to;
     offset    is the number of bytes to skip over before beginning the
               processing (a HEX number in the format /0xnnn);
     count     is the maximum number of replacements to make (given
               as /cnn).

SWITCHES:

     /0xnn  sets the number of bytes to skip over before any processing
            begins.
     /a     causes the program to write the offsets of
            matches/substitutions to STDOUT.
     /cnn   sets the upper limit on the number of matches/substitutions
            allowed.
     /f     puts the program in find-only mode.  There is no output
            file, but the offsets of the matches are written to STDOUT.
            This switch forces the /a switch also.

ERRORLEVELS:

The program returns an ERRORLEVEL to the calling batch file or program:

     Termination -  0 - normal termination
     Abend       -  1 - no arguments
                    2 - unrecognized switch
                    3 - user aborted at "Replace target?"
                        prompt
                    4 - excessive number of arguments
                    5 - replace mode chosen, but either replacement
                        string or target file argument (or both) missing
                    6 - source file or pattern argument missing (or both)
                    7 - source file does not exist or is unreadable
                    8 - source file is empty
                    9 - target file cannot be opened for write
                   10 - defective /cnn argument
                   11 - the pattern matches the replacement string

This program is presented primarily as example code, with an executable
version for those without access to the required compiler.  However,
there is no reason why it can't be used as a general purpose
find/replace tool.  The program has not been extensively tested, so some
bugs are to be expected, as is less-than-optimal coding of some routines.

OK, so I've got the program, now, what use is it?

     BREGX \xcc \xdd {icon1} {icon2} /0x80

copies an icon file while changing blue to magenta.

     BREGX {pattern} {patch} {exe1} {exe2} /c1 /0x400

applies a patch to an executable at the first match after the 1024th
byte.  And so forth.

BINARY REGULAR EXPRESSIONS:

Regular expressions are much more powerful than wild cards - they allow
quite sophisticated matches.  In binary regexes, it is necessary to
express non-typable characters with escape strings of the form \xnn,
where nn is the value of the byte in hex.

In this particular program, the beginning of string and end of string
meta characters (^ and $) are of little use.  For files of less than
4096 bytes, they match the beginning and end of the file, but for longer
files, they match the beginning and end of each file read cycle, in a
fairly complex way due to an overlap of 128 bytes in the reads.

In general:

     *      matches zero or more occurrences of the preceding character
     ?      matches zero or one occurrence of the preceding character
     +      matches one or more occurrences of the preceding character
     .      matches any single character
     []     enclose lists or ranges of permissible matches
            ([aA] matches either 'a' or 'A'; [0-9] matches any digit)
     ^      in some positions matches everything *except* the following
            class or character
     |      is the OR operator
     ()     are used to group sub patterns
     \xnn   is a byte value in hex
     \\     is the backslash character
     \+     is the plus sign character
     etc.

There is no intrinsic relationship between the pattern and replacement
string.  They do not have to be the same length, or have anything in
common.

"" is the null replacement; quote marks and several other characters
must either be escaped with a preceding backslash or given as byte
values or they will be interpreted by the command processor before they
are passed to the program.

How it works:

Review the source code for this information.

     BREGX.AWK
     OVERHEAD.FCT
     PARSER.FCT
     ENGINES.FCT
     ERRHNDL.FCT

MAKE.BAT will compile the program, provided that you have the Thompson
AWK compiler and that %AWK% points to it.

BREGX.EXE is the stand-alone executable compiled from the supplied
source code using version 2.03a of the AWK compiler supplied by

     Thompson Automation Software
     (800) 944-0139

(The compiler is excellent, but a bit pricey.)

The program writes most of its messages to STDERR, except for the list
of offsets and the closing summary.  You might want to change some or
all of that.

LICENSE and LEGAL stuff:

This program has been placed in the PUBLIC DOMAIN by the author, on
6 March 1994.  The Author advises you that the program is provided
AS-IS, complete with probable bugs.  Using this program signifies YOUR
acceptance of all responsibility for its behavior.

Ted Davis [73500,2314]
6 March 1994
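
Appendix (illustration only):

The overlapping reads described under BINARY REGULAR EXPRESSIONS can be
sketched as follows.  This is NOT the BREGX source and it is written in
Python rather than AWK; it only borrows the 4096-byte read size and the
128-byte overlap from the text above.  The name find_offsets and the way
matches near the end of a read are deferred to the next cycle are
assumptions made for the sketch, and it assumes that no single match is
longer than the overlap.

```python
import io
import re

CHUNK = 4096    # read size mentioned in the text
OVERLAP = 128   # overlap between reads mentioned in the text


def find_offsets(stream, pattern, chunk=CHUNK, overlap=OVERLAP):
    """Yield file offsets of non-overlapping matches of a bytes regex.

    Hypothetical helper: assumes no match is longer than `overlap` bytes.
    """
    regex = re.compile(pattern, re.DOTALL)   # let '.' match any byte
    buf = b""
    base = 0                                 # file offset of buf[0]
    while True:
        data = stream.read(chunk)
        buf += data
        at_eof = not data
        # Matches that start in the last `overlap` bytes are deferred to
        # the next cycle, where they can be seen whole (unless at EOF).
        limit = len(buf) if at_eof else max(len(buf) - overlap, 0)
        pos = 0
        while True:
            m = regex.search(buf, pos)
            if m is None or m.start() >= limit:
                break
            yield base + m.start()
            # Step past the match; move one byte on an empty match.
            pos = m.end() if m.end() > m.start() else m.start() + 1
        if at_eof:
            return
        # Carry the unsearched tail forward as the overlap.
        keep_from = max(pos, limit)
        base += keep_from
        buf = buf[keep_from:]


if __name__ == "__main__":
    # Made-up data: one match straddles the first 4096-byte boundary.
    data = bytearray(10000)
    for off in (10, 4094, 9996):
        data[off:off + 4] = b"\xde\xad\xbe\xef"
    for off in find_offsets(io.BytesIO(bytes(data)), rb"\xde\xad\xbe\xef"):
        print(hex(off))
```

In this sketch, as in the behavior the text describes, ^ and $ would
anchor to the start and end of each buffered window rather than to the
start and end of the file, which is why they are of little use once a
file is longer than one read.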